Abstract

In this paper, we define the notion of 
a preventative expression and discuss a 
corpus study of such expressions in in- 
structional text. We discuss our cod- 
ing schema, which takes into account 
both form and function features, and 
present measures of inter-coder reliabil- 
ity for those features. We then discuss 
the correlations that exist between the 
function and the form features. 



1 Introduction 

While interpreting instructions, an agent is con- 
tinually faced with a number of possible actions 
to execute, the majority of which are not appro- 
priate for the situation at hand. An instructor is 
therefore required not only to prescribe the ap- 
propriate actions to the reader, but also to pre- 
vent the reader from executing the inappropriate 
and potentially dangerous alternatives. The first
task, which is commonly achieved by giving simple
imperative commands and statements of purpose,
has received considerable attention in both the
interpretation (e.g., (Di Eugenio, 1993)) and the
generation communities (e.g., (Vander Linden and
Martin, 1995)). The second, achieved through the
use of preventative expressions, has received
considerably less attention. Such expressions can
indicate actions that the agent should not perform,
or manners of execution that the agent should not
adopt. An agent may be told, for example, "Do
not enter" or "Take care not to push too hard".
Both of the examples just given involve negation
("do not" and "take care not"). Although this is 
not strictly necessary for preventative expressions 
(e.g., one might say "stay out" rather than "do 
not enter"), we will focus on the use of negative 
forms in this paper. We will use the following 
categorisation of explicit preventative expressions: 

• negative imperatives proper (termed DONT 
imperatives). These are characterised by the 
negative auxiliary do not or don't. 

(1) Your sheet vinyl floor may be vinyl asbestos,
which is no longer on the market.
Don't sand it or tear it up
because this will put dangerous asbestos
fibers into the air.

• other preventative imperatives (termed neg- 
TC imperatives). These include take care 
and be careful followed by a negative infiniti- 
val complement, as in the following examples: 

(2) To book the strip, fold the bottom third 
or more of the strip over the middle of 
the panel, pasted sides together, taking 
care not to crease the wallpaper 
sharply at the fold. 

(3) If your plans call for replacing the wood
base molding with vinyl cove molding, be
careful not to damage the walls as
you remove the wood base.

The question of interest for us is under which
conditions one or the other of the surface forms is
chosen. We are currently using this information
to drive the generation of warning messages in the
drafter system (Vander Linden and Di Eugenio,
1996). We will start by discussing previous work
on negative imperatives, and by presenting a
hypothesis to be explored. We will then describe
the nature of our corpus and our coding schema,
detailing the results of our inter-coder reliability
tests. Finally, we will describe the results of our
analysis of the correlation between function and
form features.



2 Related work on Negative 
Imperatives 

While instructional text has sparked much inter- 
est in both the semantics/pragmatics community 
and the computational linguistics community, lit- 
tle work on preventative expressions, and in par- 
ticular on negative imperatives, has been done. 
This lack of interest in the two communities has 
been in some sense complementary. 

In semantics and pragmatics, negation has been
extensively studied (cf. Horn (1989)). Imperatives,
on the other hand, have not (for a notable
exception, see Davies (1986)).

In computational linguistics, on the other hand,
positive imperatives have been extensively
investigated, both from the point of view of
interpretation (Vere and Bickmore, 1990; Alterman et al.,
1991; Chapman, 1991; Di Eugenio, 1993) and
generation (Mellish and Evans, 1989; McKeown et
al., 1990; Paris et al., 1995; Vander Linden and
Martin, 1995). Little work, however, has been
directed at negative imperatives (for exceptions see
the work of Vere and Bickmore (1990) in
interpretation and of Ansari (1995) in generation).



3 A Priori Hypotheses

Di Eugenio (1993) put forward the following
hypothesis concerning the realization of preventative
expressions. In this discussion, S refers to the
instructor (speaker / writer), who is referred to with
feminine pronouns, and H to the agent (hearer /
reader), referred to with masculine pronouns:

• DONT imperatives. A DONT imperative
is used when S expects H to be aware of a
certain choice point, but to be likely to choose
the wrong alternative among many (possibly
infinite) ones, as in:

(4) Dust-mop or vacuum your parquet floor
as you would carpeting. Do not scrub
or wet-mop the parquet.

Here, H is aware of the choice of various
cleaning methods, but may choose an
inappropriate one (i.e., scrubbing or wet-mopping).

• Neg-TC imperatives. In general, neg-TC
imperatives are used when S expects H to
overlook a certain choice point; such a choice
point may be identified through a possible
side effect that the wrong choice will cause.
It may, for example, be used when H might
execute an action in an undesirable way.
Consider:

(5) To make a piercing cut, first drill a hole
in the waste stock on the interior of the
pattern. If you want to save the waste
stock for later use, drill the hole near a
corner in the pattern. Be careful not
to drill through the pattern line.

Here, H has some choices as regards the exact
position where to drill, so S constrains him
by saying Be careful not to drill through the
pattern line.

So the hypothesis is that H's awareness of the
presence of a certain choice point in executing a
set of instructions affects the choice of one
preventative expression over another. This hypothesis,
however, was based on a small corpus and on
intuitions. In this paper we present a more systematic
analysis.



4 Corpus and coding



Our interest is in finding correlations between fea- 
tures related to the function of a preventative ex- 
pression, and those related to the form of that ex- 
pression. Functional features are the semantic fea- 
tures of the message being expressed and the prag- 
matic features of the context of communication. 
The form feature is the grammatical structure of 
the expression. In this section we will start with a 
discussion of our corpus, and then detail the func- 
tion and form features that we have coded. We 
will conclude with a discussion of the inter-coder 
reliability of our coding. 

4.1 Corpus 

The raw instructional corpus from which we take 
all the examples we have coded has been collected 
opportunistically off the internet and from other 
sources. It is approximately 4 MB in size and 
is made entirely of written English instructional 
texts. The corpus includes a collection of recipes
(1.7 MB), two complete do-it-yourself manuals
(RD, 1991; McGowan and R. DuBern, 1991) (1.2
MB),¹ a set of computer games instructions, the
Sun Open-windows on-line instructions, and a
collection of administrative application forms. As a
collection, these texts are the result of a variety of
authors working in a variety of instructional
contexts.

¹ These do-it-yourself manuals were scanned by
Joseph Rosenzweig.

We broke the corpus texts into expressions us- 
ing a simple sentence breaking algorithm and then 
collected the negative imperatives by probing for 
expressions that contain the grammatical forms 
we were interested in (e.g., expressions containing 
phrases such as "don't" and "take care"). The 
first row in Table 1 shows the frequency of occurrence
for each of the grammatical forms we probed
for. These grammatical forms, 1175 occurrences 
in all, constitute 2.5% of the expressions in the full 
corpus. We then filtered the results of this probe 
in two ways: 

1. When the probe returned more than 100 ex- 
amples for a grammatical form, we randomly 
selected around 100 of those returned. We 
took all the examples for those forms that
returned fewer than 100 examples. The number
of examples that resulted is shown in row 2
of Table 1 (labelled "raw sample").

2. We removed those examples that, although 
they contained the desired lexical string, did 
not constitute negative imperatives. This 
pruning was done when the example was not 
an imperative (e.g., "If you don't see the 
Mail Tool window . . . " ) and when the exam- 
ple was not negative (e.g., "Make sure to lock 
the bit tightly in the collar."). The number 
of examples which resulted is shown in row
3 of Table 1 (labelled "final coding"). Note
that the majority of the "make sure" exam- 
ples were removed here because they were en- 
surative. 

As shown in Table 1, the final corpus sample is
made up of 239 examples, all of which have been 
coded for the features to be discussed in the next 
two sections. 
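
The probing and sampling procedure described in this section can be sketched roughly as follows. This is only an illustrative reconstruction: the regular expressions, the naive sentence breaker, and the function names (`split_sentences`, `probe`, `raw_sample`) are assumptions made for the example and are not the tools actually used to build the corpus sample. The second filtering step (removing non-imperative and non-negative matches) was a manual judgement and is not modelled here.

```python
import random
import re

# Hypothetical probe patterns for the forms listed in Table 1
# (assumed for illustration; not the authors' own patterns).
PROBES = {
    "don't": re.compile(r"\bdon't\b", re.IGNORECASE),
    "do not": re.compile(r"\bdo not\b", re.IGNORECASE),
    "take care": re.compile(r"\btak(?:e|ing) care\b", re.IGNORECASE),
    "make sure": re.compile(r"\bmake sure\b", re.IGNORECASE),
    "be careful": re.compile(r"\bbe careful\b", re.IGNORECASE),
    "be sure": re.compile(r"\bbe sure\b", re.IGNORECASE),
}


def split_sentences(text):
    """Naive sentence breaker: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def probe(sentences):
    """Group the sentences by the grammatical form(s) they contain."""
    return {
        form: [s for s in sentences if pattern.search(s)]
        for form, pattern in PROBES.items()
    }


def raw_sample(examples, limit=100, seed=0):
    """Keep all examples if there are at most `limit`; otherwise draw `limit` at random."""
    if len(examples) <= limit:
        return list(examples)
    return random.Random(seed).sample(examples, limit)
```

A fixed seed is used only so that the sketch is reproducible; the paper does not say how the random selection was made.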

4.2 Form 

Because of its syntactic nature, the form feature 
coding was very robust. The possible feature val- 
ues were: DONT — for the do not and don't 
forms discussed above; and neg-TC — for take 
care, make sure, ensure, be careful, be sure, be 
certain expressions with negative arguments. 
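
Because the form feature is purely syntactic, its two values can be approximated mechanically. The sketch below is one possible approximation, assuming that a "negative argument" can be detected by a negation word appearing after the take-care head; the patterns and the function name `code_form` are hypothetical and are not taken from the coding procedure itself.

```python
import re

# Heads of neg-TC expressions, as listed in the text.
TC_HEAD = re.compile(
    r"\b(?:tak(?:e|ing) care|make sure|ensure|be careful|be sure|be certain)\b",
    re.IGNORECASE,
)
# Assumed surface cues for a negative argument.
NEGATION = re.compile(r"(?:\bnot\b|\bnever\b|n't\b)", re.IGNORECASE)
# The DONT forms: "do not" and "don't".
DONT = re.compile(r"\b(?:do not|don't)\b", re.IGNORECASE)


def code_form(expression):
    """Return 'neg-TC', 'DONT', or None for an expression.

    A take-care head followed later by a negation is coded neg-TC;
    otherwise a bare "do not" / "don't" is coded DONT.
    """
    head = TC_HEAD.search(expression)
    if head and NEGATION.search(expression, head.end()):
        return "neg-TC"
    if DONT.search(expression):
        return "DONT"
    return None
```

For example, `code_form("Be careful not to burn the garlic.")` returns `"neg-TC"`, while the ensurative "Make sure to lock the bit tightly in the collar." returns `None`. A non-imperative such as "If you don't see the Mail Tool window ..." would still be labelled DONT by this sketch, which is why the manual pruning step described in Section 4.1 remains necessary.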

4.3 Function Features 

The design of semantic/pragmatic features usu- 
ally requires a series of iterations and modifica- 
tions. We will discuss our schema, explaining the 
reasons behind our choices when necessary. We
coded for two function features: intentionality
and awareness, which we will illustrate in
turn using α to refer to the negated action. The
conception of these features was inspired by the
hypothesis put forward in Section 3, as we will
briefly discuss below.

4.3.1 Intentionality 

This feature encodes whether the agent consciously
adopts the intention of performing α.
We settled on two values, CON(scious) and 
UNC(onscious). As the names of these values may 
be slightly misleading, we discuss them in detail 
here: 

CON is used to code situations where S expects
H to intend to perform α. This often happens
when S expects H to be aware that α is an
alternative to the β H should perform, and to
consider them equivalent, while S knows that
this is not the case. Consider Ex. (4) above.
If the negative imperative Do not scrub or
wet-mop the parquet were not included, the
agent might have chosen to scrub or wet-mop
because these actions may result in deeper
cleaning, and because he was unaware of the
bad consequences.

UNC is perhaps a less felicitous name because
we certainly don't mean that the agent
may perform actions while being unconscious!
Rather, we mean that the agent doesn't
realise that there is a choice point. It is used in
two situations: when α is totally accidental,
as in:

(6) Be careful not to burn the garlic. 

In the domain of cooking, no agent would
consciously burn the garlic. Alternatively, an
example is coded as UNC when α has to be
intentionally planned for, but the agent may
not take into account a crucial feature of α,
as in:

(7) Don't charge - or store - a tool where 
the temperature is below 40 degrees F or 
above 105 degrees. 

While clearly the agent will have to intend to 
perform charging or storing a tool, he is likely 
to overlook, at least in S's conception, that 
temperature could have a negative impact on 
the results of such actions. 

4.3.2 Awareness 

This binary feature captures whether the agent
is AWare or UNAWare that the consequences of α
are bad. These values are detailed now:

UNAW is used when H is perceived to be
unaware that α is bad. For example,
Example (7) ("Don't charge - or store - a tool
where the temperature is below 40 degrees F
or above 105 degrees") is coded as UNAW
because it is unlikely that the reader will know
about this restriction;

AW is used when H is aware that α is bad.
Example (6) ("Be careful not to burn the
garlic") is coded as AW because the reader is
well aware that burning things when cooking
them is bad.



              DONT               Neg-TC
              don't    do not    take care   make sure   be careful   be sure
Raw Grep      417      385       21          229         52           71
Raw Sample    100      99        21          104         52           71
Final Coding  78       89        17          3           46           6
Total         167                72

Table 1: Distribution of negative imperatives

4.4 Inter-coder reliability 

Each author independently coded each of the fea- 
tures for all the examples in the sample. The per- 
centage agreement is 76.1% for intentionality and 
92.5% for awareness. Until very recently, these 
values would most likely have been accepted as 
a basis for further analysis. To support a more 
rigorous analysis, however, we have followed
Carletta's suggestion (1996) of using the K coefficient
(Siegel and Castellan, 1988) as a measure of coder
agreement. This statistic not only measures agree- 
ment, but also factors out chance agreement, and 
is used for nominal (or categorical) scales. In nom- 
inal scales, there is no relation between the differ- 
ent categories, and classification induces equiva- 
lence classes on the set of classified objects. In our 
coding schema, each feature determines a nominal 
scale on its own. Thus, we report the values of the 
K statistics for each feature we coded for. 

If P(A) is the proportion of times the coders 
agree, and P(E) is the proportion of times that 
coders are expected to agree by chance, K is com- 
puted as follows: 



$$K = \frac{P(A) - P(E)}{1 - P(E)}$$



Thus, if there is total agreement among the 
coders, K will be 1; if there is no agreement 
other than chance agreement, K will be 0. There 
are various ways of computing P(E); according 
to Siegel and Castellan (1988), most researchers
agree on the following formula, which we also
adopted:

$$P(E) = \sum_{j=1}^{m} p_j^2$$

where m is the number of categories, and p_j is the
proportion of objects assigned to category j.



Kappa Value    Reliability Level
.00 - .20      slight
.21 - .40      fair
.41 - .60      moderate
.61 - .80      substantial
.81 - 1.00     almost perfect

Table 2: The Kappa Statistic and Inter-coder Reliability



feature           K
INTENTIONALITY    0.51
AWARENESS         0.75

Table 3: Kappa values for function features

The mere fact that K may have a value
k greater than zero is not sufficient to draw
any conclusion, though, as it must be
established whether k is significantly different from
zero. While Siegel and Castellan (1988, p. 289)
point out that it is possible to check the
significance of K when the number of objects
is large, Rietveld and van Hout (1993) suggest a
much simpler correlation between K values and
inter-coder reliability, shown in Table 2.

For the form feature, the Kappa value is 1.0, 
which is not surprising given its syntactic nature. 
The function features, which are more subjec- 
tive in nature, engender more disagreement among 
coders, as shown by the K values in Table 3.
According to Rietveld and van Hout, the awareness
feature shows "substantial" agreement and the in- 
tentionality feature shows "moderate" agreement. 
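
The computation of K described in this section can be sketched as follows for the two-coder case. The chance term follows the formula quoted above, with p_j estimated from the pooled assignments of both coders; the function names and the toy labels in the usage note are hypothetical and are not data from the study.

```python
from collections import Counter


def kappa(coder_a, coder_b):
    """K = (P(A) - P(E)) / (1 - P(E)) for two coders labelling the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must label the same, non-empty set of items")
    n = len(coder_a)
    # P(A): proportion of items on which the coders agree.
    p_agree = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # P(E): sum over categories of the squared pooled proportion p_j.
    pooled = Counter(coder_a) + Counter(coder_b)
    p_chance = sum((count / (2 * n)) ** 2 for count in pooled.values())
    if p_chance == 1:
        raise ValueError("K is undefined when only one category is ever used")
    return (p_agree - p_chance) / (1 - p_chance)


def reliability_level(k):
    """Map a K value onto the bands of Table 2."""
    if k < 0:
        raise ValueError("Table 2 does not cover negative K values")
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, label in bands:
        if k <= upper:
            return label
    raise ValueError("K cannot exceed 1")
```

With the toy labels `["CON", "CON", "UNC", "UNC"]` and `["CON", "UNC", "UNC", "UNC"]`, P(A) is 3/4 and P(E) is 34/64, so `kappa` returns 7/15. Applied to the reported values, `reliability_level(0.51)` gives "moderate" and `reliability_level(0.75)` gives "substantial".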

5 Analysis 

In our analysis, we have attempted to discover
and to empirically verify correlations between the
function features and the form feature. We did
this by computing χ² statistics for the various
functional features as they compared with the form
distinction between DONT and neg-TC imperatives.
Given that the features were all two-valued,
we were able to use the following definition of the
statistic, taken from (Siegel and Castellan, 1988):

$$\chi^2 = \frac{N\left(|AD - BC| - \frac{N}{2}\right)^2}{(A + B)(C + D)(A + C)(B + D)}$$

Here N is the total number of examples and A-D
are the values of the elements of the 2x2
contingency table (see Table 5). The χ² statistic
is appropriate for the correlation of two independent
samples of nominally coded data, and this
particular definition of it is in line with Siegel's
recommendations for 2x2 contingency tables in
which N > 40 (Siegel and Castellan, 1988, page
123). Concerning the assumption of independence,
while it is, in fact, possible that some of
the examples may have been written by a single
author, the corpus was written by a considerable
number of authors. Even the larger works (e.g.,
the cookbooks and the do-it-yourself manuals) are
collections of the work of multiple authors. We felt
it acceptable, therefore, to view the examples as
independent and use the χ² statistic.

To compute χ² for the coded examples in our
corpus, we collected all the examples for which
we agreed on both of the functional features (i.e.,
intentionality and awareness). Of the 239 total
examples, 165 met this criterion. Table 4 lists the
χ² statistic and its related level of significance for
each of the features. The significance levels for
intentionality and awareness indicate that the
features do correlate with the forms. We will focus
on these features in the remainder of this section.



feature           χ²      significance level
intentionality    51.4    0.001
awareness         56.9    0.001

Table 4: χ² statistic and significance levels



The 2x2 contingency table from which the
intentionality value was derived is shown in
Table 5. This table shows the frequencies of
examples marked as conscious or unconscious in
relation to those marked as DONT and neg-TC. A
strong tendency is indicated to prevent actions
the reader is likely to consciously execute using
the DONT form. Note that the table entry for
conscious/neg-TC is 0, indicating that there were
no examples marked as both CON and neg-TC.
Similarly, the neg-TC form is more likely to be
used to prevent actions the reader is likely to
execute unconsciously.



          Conscious   Unconscious   Total
DONT      61 (A)      45 (B)        106
neg-TC    0 (C)       59 (D)        59
Total     61          104           165 (N)

Table 5: Contingency Table for Intentionality



          Aware   Unaware   Total
DONT      3       103       106
neg-TC    32      27        59
Total     35      130       165

Table 6: Contingency Table for Awareness

In Section 3 we speculated that the hearer's
awareness of the choice point, or more accurately, 
the writer's view of the hearer's awareness, would 
affect the appropriate form of expression of the 
preventative expression. In our coding, awareness 
was then shifted to awareness of bad consequences 
rather than of choices per se. However, the basic 
intuition that awareness plays a role in the choice 
of surface form is supported, as the contingency 
table for this feature in Table 6 shows. It indicates
a strong preference for the use of the DONT
form when the reader is presumed to be unaware 
of the negative consequences of the action to be 
prevented, the reverse being true for the use of the 
neg-TC form. 

The results of this analysis, therefore, demon- 
strate that the intentionality and awareness fea- 
tures do co-vary with grammatical form, and in 
particular, support a form of the hypothesis put 
forward in Section 3.
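
As a check on the arithmetic, the continuity-corrected χ² definition given earlier in this section can be applied directly to the cell counts of the two contingency tables. The sketch below is a minimal illustration; the function name `chi_square_2x2` is an assumption, and the DONT row of the intentionality table (61 and 45) is taken as the only values consistent with the row and column totals.

```python
def chi_square_2x2(a, b, c, d):
    """Continuity-corrected chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one example")
    return n * (abs(a * d - b * c) - n / 2) ** 2 / denominator


# Rows: DONT, neg-TC. Columns: Conscious, Unconscious (Table 5).
intentionality = chi_square_2x2(61, 45, 0, 59)
# Rows: DONT, neg-TC. Columns: Aware, Unaware (Table 6).
awareness = chi_square_2x2(3, 103, 32, 27)

print(round(intentionality, 1), round(awareness, 1))  # prints 51.4 56.9
```

Both values agree with those listed in Table 4.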

6 Application 

We have successfully used the correlations
discussed here to support the generation of warning
messages in the drafter project (Paris and Vander
Linden, 1996). drafter is a technical authoring
support tool which generates instructions for
graphical interfaces. It allows its users to spec- 
ify a procedure to be expressed in instructional 
form, and in particular, allows them to specify ac- 
tions which must be prevented at the appropriate 
points in the procedure. At generation time, then, 
drafter must be able to select the appropriate 
grammatical form for the preventative expression. 

We have used the correlations discussed in this
paper to build the text planning rules required
to generate negative imperatives. This is
discussed in more detail elsewhere (Vander Linden
and Di Eugenio, 1996), but in short, we input our
coded examples to Quinlan's C4.5 learning
algorithm (Quinlan, 1993), which induces a decision
tree mapping from the functional features to the
appropriate form. Currently, these features are
set manually by the user as they are too difficult
to derive automatically.
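
C4.5 selects the feature to split on using an entropy-based criterion (gain ratio). As a rough illustration of that idea only, and not a reproduction of the C4.5 run reported elsewhere, the sketch below computes plain information gain for each function feature from the marginal counts in the two contingency tables of Section 5. The function names are hypothetical, and because the joint distribution of the two features is not given in this paper, no full tree is built.

```python
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def information_gain(table):
    """Gain in predicting form from one feature.

    `table` has one row per form (DONT, neg-TC) and one column per
    feature value, as in the contingency tables of Section 5.
    """
    n = sum(sum(row) for row in table)
    form_entropy = entropy([sum(row) for row in table])
    # Expected entropy of form after splitting on the feature value.
    remainder = sum(sum(col) / n * entropy(col) for col in zip(*table))
    return form_entropy - remainder


# Counts from the contingency tables (rows: DONT, neg-TC).
INTENTIONALITY = [[61, 45], [0, 59]]   # columns: Conscious, Unconscious
AWARENESS = [[3, 103], [32, 27]]       # columns: Aware, Unaware

gain_intentionality = information_gain(INTENTIONALITY)
gain_awareness = information_gain(AWARENESS)
```

Comparing `gain_intentionality` with `gain_awareness` indicates which single feature would be the more informative first split under this simplified criterion.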

7 Conclusions 

This paper has detailed a corpus study of pre- 
ventative expressions in instructional text. The 
study highlighted correlations between functional 
features and grammatical form, the sort of corre- 
lations useful in both interpretation and genera- 
tion. Studies such as this have been done before 
in Computational Linguistics, although not, to 
our knowledge, on preventative expressions. The 
point we want to emphasise here is a methodological
one. Only recently have studies begun to make
use of the more rigorous statistical measures of
accuracy and reproducibility used here. We have found
the Kappa statistic critical in the definition of the 
features we coded (see Section 4.4). 

We intend to augment and refine the list of fea- 
tures discussed here and hope to use them in un- 
derstanding applications as well as generation ap- 
plications. We also intend to extend the analysis 
to ensurative expressions. 